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ABSTRACT 


This thesis studies the estimation from interarrival and 
service time data of the mean number of customers in service 
at time t for an M/GAqueue. Two situations are considered. 
In one the parametric form of the service time distribution 
is known. In the special case in which the service time 
distribution is exponential the approximate bias and vari- 
ance of the estimate are derived and simulation is used to 
study an approximate normal confidence interval procedure. 
Simulation is also used to illustrate that assuming a wrong 
parametric model can lead to misleading results. In the 
other situation, the parametric form of the service time 
distribution is unknown and the empirical distribution of 
the service times is used in the estimate of the mean number 
of customers in service. In the case in which the customer 
arrival rate is known the distribution of the estimate is 
derived and an approximate normal confidence interval proce- 
dure is suggested. The use of the bootstrap and jackknife 
procedure to estimate variability and construct confidence 
intervals for the estimate is also studied both analytically 


and by simulation. 
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Pee NTRODUCT IO 


Pee DE OCRIPLION OF THE PROBLEM 

The application of probability theory to a wide variety 
of congestion problems has been described in many papers and 
beoks [{Refs. 1,2,3]. Results of queueing theory are 
presented in terms of component distribution functions and 
stochastic processes (renewal, Poisson, etc) that are taken 
as known; only rarely are issues addressed that arise when 
actual data is to be used as a basis for inference from the 
models; however, see Cox( 1965) [Ref. 4]. 


The concern of this thesis 1s inference problems for a 


particularly simple queueing model, the M/G/oo queue. 1B 
this model, customers arrive according to a Poisson process 
with rate nN and there are an unlimited number of 
independent servers. service times for each server are 
independent, identically distributed With “distri out ren 
Bilge tion EF. bee AG be the number of customers being 
served at time t. It is well known that if there are no 


customers being served at time O, then 


K 
Pix(tjan} = ETSI expt -m(t) gag 
where 
M(t) = \(*F(s)ds 
wpeeiger( t)—1-P(t) |Ref. 2]. Thus the distribution of X(t) is 


Poisson and is characterized by its mean M(t). 

In this thesis we will assume that the service time 
Sascribution F(t) 1s unknown and must be estimated from 
service time data and that the arrival process is Known to 
be Poisson, except possibly for its rate : We will study 
the estimation of the mean number of customers being served 
meecime t, M(t). 


Be SCOPE Of ThE. fess 

The purpose of this thesis is to study the estimate of 
the mean number of customers being served at time t for a 
M/G/o queue. This mean completely characterizes’ the 
distribution of the number of customers being served at time 
ce We will assume that the service time distribution and 
possibly the customer arrival rate are unknown and must be 
estimated from data. 

We generally divide the estimation method into two cases 
which we shall call "parametric estimation" and 
"nonparametric estimation". In the parametric estimation 
case, a particular probabilistic model is specified for the 
service time distribution and the parameters of the 
distribution are estimated. The resulting estimate of the 
survivor function is then used in the estimate of the 
expected number of customers being served at time t. In the 
nonparametric estimation method, the empirical survivor 
function is used in the estimate of the expected number of 
customers. 

In most cases, parametric assumptions concerning the 
service time distribution are difficult ’to guseie Hence 
nonparametric estimation procedure may well be preferred to 
parametric estimation when actual data is' used. However, 
the nonparametric estimates can be expected to be less 
efficient than the parametric ones. 

The thesis is organized as follows. In Chapter II, the 
transient distribution for the number of customers being 
served at time t for the M/Gfo model is described and the 
equilibrium distribution as time goes to infinity is 
obtained. In Chapter III, we study parametric estimates of 
the mean number of customers being served under several 
assumptions for service time distributions. In the special 
case in which the service time distribution is exponential 


the approximate bias and variance of the estimate are 


10 


derived and simulation is used to study an approximate 
normal confidence interval procedure. Parametric estimates 
for gamma, mixed exponential, and lognormal distributions 
are also considered. Simulation is used to study the effect 
of assuming a wrong parametric model. In Chapter IV, a 
nonparametric estimate of the mean number of customers being 
served is described. This estimate is based on the 
empirical distribution of the service times. In the case in 
which the customer arrival rate is assumed Known the 


distribution of the nonparametric estimate is derived and an 


asymptotic normal confidence interval procedure is 
suggested. The jackknife and Bootstrap methods for 
obtaining approximate confidence intervals are also 
described. The different estimators are compared by 
simulation. Chapter V describes the simulation and gives 


the results. 

In summary, this thesis studies the use of estimates of 
the service time distribution to obtain estimates of the 
mean number of customers being served for a M/GAO queue. 
Both parametric and nonparametric estimates are considered 


and compared by simulation. 


AUiL 


Il. M/G/oo QUEUE MODEL 


The M/GAo queueing model is specified by the following 
assumptions. There are infinite number of servers. 
Customers arrive for service according to a Poisson process 
with rate A. Service times are nonnegative independent 
identically distributed random variables with distribution 
UMC tone e When a customer arrives, he immediately starts 
service. 

Let X(t) represent the number of wustomers in service at 
FLmMe tc: It 1s well known that if there are no customers 
being served at time O, then 


| K 
PixX(t)=k} = eave exp[-Ap( t) ] (275 


where p(t)= \* [1-E(s)]ds: Chat. isSy.. Xige) has a Poisson 
distribution with mean A p(t) [Ref. 2]. Taking the Limeum 
t 30 in \eqQuaeien — Zee we obtain the equilibrium 
GaSe ri puU con 


— K 
lim P{X(t)=k} = LA So"C- Foo) da] expl - A {0 1-F( x)dx] (2.2) 


700 K J 
Thus, the limiting distribution of X(t) as t»>@ 1585 alee 
Poisson with mean 4m, where m= jPF( x) dx is the mean service 
time. Therefore, the distribution of the number ie. 


customers being served at time t is Poisson with mean 
ee 
M(t) =A | F(x)dx (2.3) 
Here, the distribution of the number ' § of customers being 
served at time t is characterized by value of its mean M(t). 


The value of M(t) depends upon the service time distribucien 


which is assumed unknown and must be estimated from data. 


V2 


This thesis considers the problem of estimating M(t) from 


service and interarrival time data. 


153 


Ill. PARAMETRIC Zelivial lone aes 


Ao) DESCRIE LION 

In this chapter, it will be assumed that the parametric 
form of the service time distribution is Known. In this 
case the estimation of the mean number of customers being 
served at time t, M(t), can be considered to be a function 
of the parameter estimates of the distrilpucitom In 
particular, the estimate of M(t), when a parametric form of 
the service time distribution is assumed, is denoted by 
Mpo(t), then 


M(t) =A (* F(s)ds (3.1) 


where E(t) is a survivor distribution of an assumed 
parametric form. 

In this chapter the rate of the arrival process will be 
assumed to be unknown. Maximum likelihood estimates of the 
mean interarrival times and the mean service times are used 
in the estimate of Mp(t). 

Four parametric service time distributions will be 
considered: the exponential, the gamma, the mixed 
exponential, and the lognormal distribution. In the 
exponential case, moment approximations are used to assess 
the bias of the estimate and to develop a confidence 
interval procedure based on asymptotic normality. The 
performance of the confidence interval procedure is assessed 
by simulation. 

In the remaining three parametric models, Simulation is 
used to assess the performance of the parametric estimates. 
Another source of error in using a parametric estimator is 
that the wrong parametric form may be used. The effect of 
using the (wrong) exponential model in these cases is also 


assessed by simulation. 


14 


Each simulation has 300 replications; each replication 
consists of 50 independent service times from the specified 
distribution, and 50 independent interarrival times from an 
exponential distribution. The average relative bias and the 
average relative square error of Mp(t) are used to evaluate 
the performance of the parametric estimation method. All 
simulations were carried out on an IBM 3033 computer at the 
Naval Postgraduate School using the LLRANDOMII, random 
number generating package [Ref. 6]. 


B. EXPONENTIAL SERVICE TIME 

In this section it will be assumed that the service time 
distribution is exponential; that is, F(t) = 1-EXP(-t/y ), 
where M 1S an unknown parameter and must be estimated from 
the observed data. The maximum likelihood estimate of M is 


*h customer. 


Kod = Mere A, mo the service time of the 2 
We will also assume that the rate of the Poisson arrival 
process X is unknown and must be estimated. 

peewee raratval Eimes of Che customers are denoted by y, 
Mi > 7 Yn: Since the arrival process is Poisson with rate 
ee; the interarrival times are mutually independent, 
positive random variables with the exponential distribution 
function having mean ~ The maximum likelihood estimate 
of d is ay ae For an exponential service time 
distribution, an estimate of the mean number of customers in 


service at time t for an M/GAO queue is 
A A A . 
Mp(t) =A 4 (1-expl -t/al ) (eZ) 


The estimate is aononlinear function of the estimated 
parameters, se and ee In most cases, when estimating a 
function of the estimated parameters, bias is created by the 
nonlinear relationship of the estimated parameters. 
Approximate formulas for the bias and variance of Mp( t) wast 


now be derived. 


i> 


Let 3 be the mean service time, a = 5 3 2-2 
and ot be the mean interarrival time, X= 
By assumption, X; and Y, are independent. The estimate of 

meme can be represented by a function of the parameters of 
a 8 as follows: 


A 


M(X,@) = if 8 (1-expl -t/gl ) (375 
There are no simple, exact formulas for the mean and 
variance of the quotient of two random variables. However, 


there are approximate formulas which are sometimes useful. 
The approximation can be obtained from the partial Taylor 
series expansions of M(X,) about the true means, o& andlam@ 
The expansion is 
Mm(X,8) = M(ot,@) + Bema ,a)(X-y) + seM(w,a)(8-8) 
St 3° 


\ 
Boca Pe BX et) (6 - C) ees 1B )(X =X)? 
y A 
+s 5gril™.A)(8-B)? + Rr (3.4) 


Since we assumed that the arrival process and the service 
times are independent, the covariance terms turn out to be 
zero when we take the expectation of both sides of equation 
3.4. Thus, we get 


E[M(X,@)}] = M(B) + 5 Some, var(&) + - sector a Varta 
+ Ry (3s 
where Ry converges to zero at the rate ae The variance of 
estimate is 
Var[ M(«,A)] = [ 2 -M(at,g )] 2Var(&) + tsar ay evar’) 
+ aR ( 3469) 
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with Ry fencing tO Zenomat the rate 7k An approximate 


bias term, denoted by Bo(t), can be derived immediately from 
the equation 3.5, that is, ,(t)=E[M(a,@ )-EIM(x, @)1]1. 
Subtracting Bp(t) from the parametric value to correct the 
bias, leads to the bias corrected estimate of Mo(t). 

In order to compare the two estimates, bias and 
bias-corrected, we define the following notation. Let 9, be 
the fraction of bias of Mp, (t) against the true value M(t), 
and 9, be the fraction of square error of Mp( t) against the 
square of the true value: that is, 


= 2545-7822 (oe) 


+)- 
Bee pate) 5 2 (3.8) 


where M(t)=A{ F( s)ds. 


8) 


AVERAGE RELATIVE BIAS 
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Pueuinres sae. Average Relative Bias of Mo(t) 
for Exponential with M=2 at t=l 


iby 


A simulation experiment was performed to assess. the 
performance of the estimates. In the ith replication, 50 
exponential interarrival Cimes having mean 1 and 50 


exponential service times having mean 2 were simulated and 


estimates 

A A OA a 

Mo(t) =-M(l-exp[ -t/u] ) (3.9) 
and 

A ¢ A 

Mo( t) = Mp( t) = Bo(t) (3.108 
where 

= st > IN \ > x 

Bolt) = 5 sorM(4,8)Var(&X) + 3 sesM@ A)Var(a) 
were computed. The estimated values of xy and 8 were used 
in the variance formulas. The simulation was replicated 300 


times and the average relative biases 


Q(t) = ae a orang ae ---] (3.1 


ACA 


a 300 t)— 
8, (t) Sa >_ , Mdd= Met>_, ( 3g 
300 Az \ ices 


and the average square error 


Antes ase = eee 2 Saale 

O.(%) 300 oe 7 M (t> 

ae , 82 ety = My 

Che) = See 2 eenSeee SESS] * (3.14) 
300 Ma) 


were computed. 
Results of the simulation are presented in Figures 3.1 


aticiees...2.. In Figure 3.1, the dotted line shows the average 
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Figure 3.2 Average Relative Square Error of Mp(t) 
for Exponential with uw =2 at t=l. ( n=50 , r=300 ) 


Belacive bias, 6,(¢), aswavaumetion of € for Che original 
estimate M,( t). The solid line shows O(t) for the bias 
corrected estimate M(t). This superimposed figure 
indicates that 6.(t) for the bias-corrected estimate (with 
solid line) is almost constant and is small. The bias 
estimate produces large negative value of 6,( t) Due @(t) 
approaches a limiting value as t 370. Figure 3.2 shows of 
the average relative square error 6,(t) and 6 t) plotted as 
a function of time. The dotted line gives 6,(t) and the 
solid line is BICS)- It appears from figures that the 
estimate of bias described in equation 3.5 does correct for 
the bias. However, in Figure 3.2 the bias-corrected 
estimate has a slightly higher relative square error than 
the original estimate. This higher relative square error 
could be due to correlation between the estimate itself and 
the estimate of its bias. 

Simulation was used to assess the performance of the 


following confidence interval procedure. In the if? 


ale, 


TABLE I 


COVERAGE AND LENGTH OF 100(1-&)% C.1I. FOR 
THE ORIGINAL ESTIMATE ( N = 50, R = 300 ) 
je 68% | S00 | | 90% ~~ 
trial | errr errr rece cn | ce ec ee eee n ne | ee ee eee -- 
hengehney ©. ke henge’ || Clearer: Leng Cia lGe Re 
(s.d) (sooo (s.d) 
QO 22349) Oe o7 i) Onze 2 4.00| 0. 3632 1.0@ 
12.00 ose 90. OO 
(0O.0415)117.33;)(0. 0535) | 13. 00)( 0. 0687 ae cae 
O-22357 935 1 Owziees 4.33} 0.3699 1.38 
Z Oreos le OXG, Sb. 
(0.0396)117.33)(0. OS10) | 14.67) (0. 0633) 
O. 222455 OO | On Zeon 4.33) O.S6g@e8 1.38 
3 sieves i 7220 Silos 
(0.0410) | 20. 33) (0. 0529) | 16.67 O2 G67 S jae 
Oe 117 00 | "OnZ2eou 5. 6 One eae 17ee 
+ Sleymieys| 76200 83.38 
(0.0402) | 24. 33/(0. 0519) | 20, 33)1( 02 0666) aie 
On 22077 | 3-334 02 2e44 4.87 O2 365 O..67 
S S250 Ligoe Se. 67 
(0.0409)124. 33|(0.0527) | 17-67 (Oe 0G7 7 sale ey 
OF 2769" LAs oaeaZ 2 S 3.607 WO eS eaoet O.. 338 
6 70. 00 SZ. 67 Sle SS 
(0. 0433)/18. 67/!( 0, 0558) 1) 13. 6711 0. OF 7 ere 
OF222 5 LOmon meOe Z obe 53.00] O8sGr7 On 3S 
7 69. 00 82.00 So bi 
(0.0375) /20. 33|( 0. 0483) | 15. 00] (0, C6zZG ices 
ON 2246 7,67) 0.72595 Z 3S OSes O-se 
8 at lead O74 Sd 735 S/26y 
(0.0425)120.67/( 0.0548) 116. 33 )( 0: 0704) ae as 
O22252. [11-00 C228 76 6. 00 || 023687 O. 67 
9 SS 4a7 V5 (8 7 83.67 
(0.0423) ) 25. 33|( 0.0545) | 21.33 | (O20 Ooh. 
Or 2270" Visor 205 297s 3. OOF (Ones 0:38 
10 Bio SEO Le Sis 
(0.0411)/18.67/(0. 0530) | 16. 00 | C@s06s0) mars 
OF2735>)| VOnt yy Ome se 3. JOO sery QO. 74 
Average 6S. 50 SOs GS oe, oS 
| | 0410) | 20. 734(0. 0528) |15.474( 0. 0678) | 1 ies 
replication of the simulation, 50 exponential interarrival 


times having mean 1, and 50 exponential service times having 
mean 2 were generated, and Mp( t) and Mo(t) were computed. 
The approximate variance of Mp(t) was computed for t=l using 
equation 3.6 with Rn=0. The 100(1-%)% confidence limits L 


and U were computed by 
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TABLE II 
COVERAGE AND LENGTH OF LOO( 1-0) % Cain FOR 
( N50, R= 


THE BIAS-CORRECTED ESTIMAT sco) 
) Ae 68% | 80% | 90% | 
trial | eect ttt r sss s ns | tress st rrr rss lect cc ccssccsc- 
emcee @- ke Bengthayc. R. Length iC. Rk. 
(s.d) (s.d) (s.d) 
Om7z54 |\NaG7 | (0.2879 G67 | 0.3695 DOS 
al 69.00 S067 89.00 
GOO 2 Po eres O-0535 12.67 |/(0.0687)) 8.67 
C2257 oro 7 O. ZE6Zz Be Oy | Oo 3699 rile I) 
, 62). 67 SO. 33 SOuecr 
6 e2 oo) | P6767 )( OS @510) | 14, 00)(0. 0655))}11. 00 
Uazea | LeR00) ~OeZso6 7, CHO S67 6 Ye O1®, 
3 66. 00 TGs SiS Ci 3s 
OrO4-O ) 1 1S001 (0.0529) 116. 00) 0.0678). 10. 67 
Ome. | v4733) 0. 2851 GeOOie 0. 3659 ey 
4 622-53 74.33 Sas 00 
OO 2 0 7arss oo O, O59 )119.671(0. 0666) |, 15. 33 
0.2207 |17.67| 0.2844 poe  O. 3651 iG, 
5 Seeley 74. 67 88.67 
me O409 apy 67 (-O. 0527 7116.33100706/6)) 9.67 
O22269 |Wer6/)| 0.2925 Gees | O63 754 Z..O00 
6 64. 00 Si. 33 8S..00 
CORO coa iid .3o (GO, O55e6))12,33)( 0. O7 179,10. OO 
Oe2270" i 12°67 moe Zoo >. ofan O. Sere, COO 
7 Gis 00) S0.35 So.07 
(Om Oor7 >) 167363 1( 0.0483) )14.00/(0. 0620)! 9.33 
O.2246 |14.00; 0.2895 267 | O.. 375 O67 
8 6735 80. 00 88. 00 
(OP O4Z5 ime 67 |) (020548) | 15.33)(0.0/04)i11.33 
Oae7s50 (tone) O. 2676 OOO | O7369Z eo 7 
9 60. OO 7026 / Soes35 
(COnO SZ So abZ cont ©, 0545) 120. 331(0.0670)115. 00 
O27 Omni ne, 33 || 90. 2926 eos O37 55 lo 7 
Ie, 63.00 Fis-.35 S6. OC 
(Gere fe) hereon eO.O550) (15, 33)(0O: 0680); 12.33 
Orme 3o51 15. 77\20. 2331 700 |NOlT 3697 1. 84 
Average 64. 80 77.40 86.03 
| O4V07, 19. SaveO. 0528) 1/15. GO WC0. 0678) | 11. 38 
A 
MPT) 28 Vor EMC%, By] ‘aaa 
and 
U=Mp>(t) + 2-9 Var lM CuA 4 (3.16) 
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where Z,.2 is the upper 1-2 point of the standard normal 


distribuenen: Tables I and II show the results of 10 
independent Simulations for the Ori gilan and the 
bias-corrected estimate. Each simulation was replicated 300 
times. Tables report the average and standard deviation of 


the normal confidence interval length; the proportion, > o£ 
the intervals that covers the true value; the proportion of 
intervals that are too high, (e.g. M(€) < Di =eanchee 
proportion of intervals that are too low,(e.g. M(t) > U ). 
Since the simulation replications are independent, it is 
possible to assess the uncertainty of D. If the confidence 
interval procedure 1S correct, then p should be within 
approximately + 2/2 of 1-. The coverage rate in the 
tables indicate that the parametric estimates tend to 
underestimated. Obviously, the distributicnewen Mp( t) is 
skewed right. However, the confidence interval procedure 
works’ well, regardless for both the original and the 
bias-corrected estimate. Both have a variable coverage 
rate. The difference of performance between two estimates 


isnot “sidgnitiveane. 


ey OTHER SERVICE TIME DISTRIBULIONS 
ae Mixed Exponential Service Time 


In this subsection, service times having a mixed 








exponential parametric form will be considered. Customers 
arrive according to a Poisson process with unknown rate 


\ which must be estimated. Customers are Of two types ay ae 


PiECD a2 Pie yor pie eetra customer's service time is exponential 
With mean WW; wlth probability 77257 the customer's service 
time is exponential with mean JA,. If T is the service time 


of an arbitrary customer, then 


P(T > t) = B, expl<-t/u, |] + Puexpite es (3.108 


Yao 


miemweer pet Foo = I. Pech omecadse, Lie “mean number of 


customers being served at time t is 
Mme) = Aj fy Cl exp Ct /uy)) > Poa C1 - exp (-t/uard$( 3. 18) 


We will assume that a customer's type and service time are 


observable. 


“nn 
4 
ca 
S 
= 
we 
tad 
3 





PEgure 3. 3 Average Relative Bias of Mo(t) LOT 
Mixed Exponential with 4=2, 4W.=.75, PP) =.2 at t=l 
( n=S50, r=300 ) 


A simulation experiment was done to assess’ the 
performance of Mp(t). In the ith replication, service times 
fetes) Customers were generated, where the proportion P, of 
them have a type 1 and the proportion P, have a type 2. A 
random number was drawn to determine the type of customer. 
If a customer was of type i, a service time was generated 
from the exponential distribution with mean ui. For the 
Simulation, the proportion P, is assumed known. The 50 


independent interarrival times were also generated. Im Gnis 
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Figure 3.4 Average Relative Square Error of Mp(t) Ons 
Mixed Exponential with W=2, W=.75, Py=.2 at t=l1 
( n=50, r=300 ) 


case , P, =0.2, U,=2., mi=0.75, and j=1. The estimate Mp(t) 
was computed; the estimate of PB is the proportion Weg 
customers that were type 1; the estimate of the mean service 
time Us was the mean service time of type 1 customers; the 
estimate of A was the mean interarrival time. The estimate 
M p(t) of M(t) assuming that the service time distribution is 


exponential was also computed, that is, 


Me( t) = u (l-expl -t/al ) (3 i) 


w 
with equal to the mean service time and X is equal to 


the mean interarrival time. The procedure was replicated 
300 times. Results of the simulation appear in Figures 3.3 
and 3./4. St) is the average relative bias of the estimate 


and A(t) is the average relative square error as computed 

in @€quation 623). anaes The solid line is for the 
A 

correct parameéerice model “estimate a ar. A The dotted line 


represents the exponential model estimate M,(t). 
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Mirmooth ~ figures, we compare the correct model 
(represented by a solid line) with the erroneous exponential 
model. As expected, the exponential model clearly does not 
perform as well as the mixed exponential model. In Figure 
3.3, the level of bias for the exponential model is very 
high early in time, but is reduced, and stabilizes as t 2m. 
Notice that the estimate using the exponential model appears 
WoO large. The average relative square error of the 
estimates is shown in Figure 3.4. Although the results of 
exponential model shows a slightly higher mean square error 
than that of the mixed exponential model, the results of 
exponential model are not too much worse. tis 1S noc 
Surprising, since as t 9@ the estimate just depends on the 
mean. 


ie Gamma Service Time 








In this subsection, the parametric form of the 
service time distribution is gamma. The probability density 


function is 
cad a K YK-1  -at 
fer) = --- Bt” e (2520) 
kK) 
where k and @ are strictly positive parameters of the 


distribution, and k is further assumed to be an integer. By 


successive integrations by parts, we get 


K-{ A 
F(t) =1- > eg ** ese) 
AzO 4 | 


its mean and standard deviation are 


| 
M= < G = Ji 
8 B 
aus, k is the parameter that specifies the degree of 
variability of the service times relative to the mean. 
Since the arrival process is Poisson with rate ji, 


the interarrival times are mutually independent, positive 


Z2 


random variables with the distributien funeeren 
G( y)=1-EXP(- Ay), where A must be estimated from 
interarrival data yy. 7 a—ecomm Thus, the Mo( t) in tie 
case is obtained by the successive integrations by parts of 
the survival function of the gamma distribution, that is 


n ri 
S (1s e8t (at 
- )5 (3.225 
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Figure 3.5 Average Relative Biasjou Mega) 
for Gamma with @=1, k=2 at t=l. (n=50, r=300 ) 


The performance of M > (t) was evaluated by the 
Simulation. In the simulation, the parameter k of the gamma 
distribution was assumed to be known, but the rate of the 
arrival process is unknown and is estimated. Two simulation 
cases were run. In the ith replication of the simulation, 
independent service times were generated, where the first 
Simulation case used the gamma distribution having 3=1 and 
k=2 and the second case used the gamma distribution having 
8=0.5 and k=4; 50 independent interarrival times having 


8=1 were also generated. For k=2, the estimate is 
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A 


M(t) = \-3-(1-expl -8¢] - t-expl -at) } (3.23) 


and for k=4, the estimate is 


i _ ee Ms _ ae Aes Doe 
M(t)  { : (l-exp[ -at] exp[- 4t](3t + at’ + E *o)} 


cs ‘ A nN ; 
where \=n/= y. and gf =kn/Z x, were calculated. An estimate 
>t q=\ 
based on an erroneous exponential service time model 
parametric estimator was also calculated 


A A A 


Me(t) = AM (1-exp[-t/i] ) (3.25) 
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Figure 3.6 Average Relative Bias of Mp(t) 
for Gamma with @=.5, k=4 at t=l. ( n=50, r=300 ) 


The simulation was replicated 300 times. The 
average relative bias @§(t) and the average relative square 
error 6(t) were calculated. These results appear in 
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Eirqurve-3. 7 Average Relative Square Error of M (t) 
for Gamma with ®@=l1, kK=2 at t=l. ( n=50, r=300 ) 


BME S? Si-5. JSC ocean Alcina. The dotted lines in the 
figures show the results of the erroneous exponential model. 
The solid line represents the results of the gamma model. 
The figures clearly show that the performance of the 
exponential model is not as good as that of the gamma model 
ally geal syle r= WM ge However, both models have the same equlibrium 
Stace £Or “Cos 0r ts, "approximacely. Figure 3.5 shows the 
average relative bias of the estimate in the case of k=2 and 
Figure 3.6 shows the same in the case of k=4. In bets 
figures, the erroneous exponential model has a high level of 
bias and the bias of the gamma model is almost constant; 
however, both models have exactly the same limiting value as 
t7® . The average relative square error appears in the 
FIGUrES, 3.9) y Banc oe. Figure 3.7 represents the average 
relative square error of Mp( t) and Me(t). in case of k=2, 
and Figure 3.8 represents the same in the case of k=4. hig 
Figure 3.7, the exponential model has a poorer performance 


than the gamma model, but the difference is small. E15 


ZS 


Figure 3.8, the exponential model has a large value of mean 
square error initially and the level of mean square error 
associated with the exponential model grows as the value of 


parameter k is increased. 
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Bagure 32d Pe Gateunelatawe  ojduace omnor OL M>(c) 
for Gamma with @#=.5, k=4 at t=1l. ( n=50, r=300 ) 


or Lognormal Service Time 
In this subsection, the service time distribution is 





assumed to be lognormal. Let X be a random variable, and 
let a new random variable Y be defined as Y=ln X. rf’ Y has 
memermal distribution, then X is said to have a lognormal 
geese rl Oution. The density of a lognormal distribution is 
given by 
i 2) Seer Aiba eS jae (3.26) 
eons Sa) 
ak 
where -@< § <c and 9 >0. Set Y=ln X-8 and by the 


Zo 


integration 








P{X¢ x} = BY (x) 
a LX -§ 
) aie | exp l-y/ag*] dy 
= Gy (BEY (35298 


where G.(-) 1s the standard normal distribution function. 
If the service time has a lognormal distribution, then the 
estimated mean number of customers being served at time t, 
Melt), is obtained by integration-by-parts 


A 


Mp(t) = A {tl1-E(t)] + (* sE( s)ds} (3. 28) 
Substituting equation 3.26 for £(t) and equation W3. 27 eee 
F(t) in the equation 3.28, we obtain 


melt) = ALEC Gy (BRE 4 exptse SG, (HES y3 (3.29) 


where Ds) is the standard normal distribution 2uncriom 
The lognormal distribution is positively skewed and the 
level of skewness depends upon the value of mean and 
Varlance.._ Of Ene- “aistribyeion:. If the value of mean is 
decreased but the variance is increased, then the shape of 
distribution tends to be more skewed and it approaches the 
shape of the exponential model. 

The performance of the parametric estimate was 
assessed by simulation. 15j@ each replication of the 
Simulation, 50 independent lognormal service times and 50 
independent exponential interarrival times were generated. 
The simulation generated two sets of the service times. One 
set of service times is from the lognormal distrbution with 
3 =0.193 and inal ,and the other set is from the lognormal 
distribution with § =0.568 and 6 =0.5. To estimate the mean 


and variance from the data, the logarithm of the service 
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Bagure 3.29 Premage wkelabtive Bias of Mo{(t) form 
Lognormal with 8 =.193, ¢ =l at t=l. ( n=50, r=300 ) 


AVERAGE RELATIVE BIAS 





Buguce 3. 10 Average Relative Bias of Mp(t) Lots 
Lognormal with §=.568, ¢ =.5 at t=l. ( n=50, r=300 ) 


fome Gata, tCi,=ln x;, i=l to n was computed. The mean and 


variance are expected by 


ouk 


ae sh 2 
ae ee 
Thus the estimate 1s 


a 


Mp(t) =Aft-Ti-Gy(ata8 54 + expC ore >, Cals) (3.30) 


“ . . . 
where A is the estimate of the arrival tate: An estimate 


based on an erroneous exponential model 
A A A A 
Me(t) =A-4 (1 - exp[-t/a] ) Gac) 


A a 
was also computed, where A= pH Oey 
AM 


2 
% 
3 
3 
> 3 
2 


R 





Ebene: se) Average Relative Square Error of Mo( t) 
for Lognormal with §=.193, § =1 at t=l. ( n=50, r=300 ) 


The simulation was replicated 300 times. Figure 3.8 
shows the tendency of 6,(t), the average relative bias of 
Meee for the correct model (shown with a solid line) and 
the erroneous exponential model (shown with a dotted line) 
for the simulation with §=0.193 ama § =1 in Figure 3.9, and 
with §$=0.568 and §=0.5 in Figure 3.10. 


32 


0.04 0.05 


0.03 





0.02 


TIVE MEAN SQUARE ERROR 


R 
0.01 


Bere e. So 12 Average Relative Square Error of M,(t) 
for Lognormal with § =. 568, A SSS te oe ( n=50, r=300 ) 


Meee Doth figures, the average relative bias of the 
exponential model is also large initially. As expected, the 
average relative bias of the exponential model in Figure 
Seo is larger than the results in Figure 3.9. As t 70, 
the exponential model shows better performance and has a 
Brmtcing value. Figures 3.11 and 3.12 show the tendency of 
Sumer the average relative square error, for the correct 
model (shown with a solid) and the erroneous exponential 
model (shown with a dotted line). Bot ~Piqgure 2s.) the 
parameters, 8 =0.193 and Ge, abe used to generate data. 
Pee tor Figure 3.12 the parameters, $=0.568 and f =0.5, are 
used. The value of average relative square error of the 
ergeemencial model in Figure 3.12 is also higher than the 


mesiles in Figure 3.11. 


D. SUMMARY 
The general conclusions of this chapter are that the 


Beeametric estimation method is a highly efficient for 


ae 


obtaining estimates of M(t) whenever the correct assumption 
for the model 1s Given. The structure of the estimate of 
M(t) is clearly biased since it is a nonlinear function of 
the estimated parameters for the service time distribution 
and the customer arrival rate. However, for the service 
time distributions considered the indications aremetnacmame 
bias is small. Hence the parametric estimation method 
performs very well, whether or not the estimate is corrected 
for bias, when the correct parametric form is used. However 
the performance of the parametric estimation 1S very poor 
when the wrong parametric model is used. For instance, the 
erroneous exponential model often has a high level of bias 
and mean=-squared error. Notice that the exponential model 
converges to the same limiting value as the correct model as 
t xo in all the cases considered. This is because as t 7@ 
all estimates use the mean service time to estimate the 


integral of the servivor functions. 
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IV. NONPARAMETRIC ESTIMATION 


woe DESCRIPTION 

Nonparametric methods are statistical techniques which 
are applicable regardless of the form of the distribution 
function that the measurement comes from. Tamehis shapter, 
these techniques will, for the most part, be based on the 
order statistics. 

foc en we ccnOcesa candom sample from a CDE EF, 
and let Sis, 0 SEO rans denote that corresponding order 


statistics. Then the sample CDF is defined by 


F, (+) - + (number Ore S,, less than or equal to t) 
ee 
Pin se Lice 44 id 
For fixed time t, F.(t) is am statistic simee We is a 
function of the sample. In fact, for fixed time t, F,(t) 


has the same distribution as that of the sample mean of a 
Bernoulli random variable. We know by the central limit 
theorem that F,(t) 1s a asymptotically normally distributed 
with mean F(t) and variance ( = )E(t)[1-F(t)]. 

Recall that (in chapter II) the mean number of customers 
being served at time t, M(t), is a function of the arrival 
rate \ and the survivor function of the service times. 
memce a nonparametric estimator, denoted by M(t), can be 
represented by the estimated values of i and F(t). The 


estimated survivor function of service time is 


Aw 


Be(t) = 1 132 OSES sche 
n-Aa ; 
Be os ie Soot —=—S PO 12 
r (A) Ort) 
0 if et, > Som 
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Now, using the fact that M,(t)= \ \* F(s)ds, we obtain a 


nonparametric esimate 


A A 
Ml oe eer IN ic if Oe ac 
s eo N-4 ; 
A[ -- _ t+ wee t if Sse 
oe fv) 
Ar ee : 
AT es a! 2 1 Sas (4.1) 
A 
where A is the estimated arrival rate. Note chat the 
nonparmetric estimate has a limiting value as tv7om , that 
A 
1S, wet My(t)=Am where m is the méeanvsenpviceseamnce 
a A? 0) 
In €£hvs <hapeer, we Will consider two differenms 
situations. In one case, we will assume that the arrival 


rate is known but the distribution of service times is 
unknown and must be estimated. Based on this assumption, 
A 

My(t) is expressed simply in terms of the order statis ieee 


of the service times as follows 


M : mk 4.2 
3) -- t --- t 
NCO WG Cab Se) (4.2) 
< . e 
when Toes Ske) In the Appendix A, we derive the 
chil sepgal obiealepg cope Miia Calig\y Tele@lts! Melsisysy Its mean and varianee 
are 
A tex 
E(My(t)] =A\F(s)ds (4.3) 
A \ t ome 
Var{[M,(t)] = ~ { \osF(ds) - { S'sF(ds)]? + t?E(t)F(t) 
= t 
- 2tF(t)( |°sE(ds)] } (4.4) 
A 
Thus, Myj(t) 258 an unbiased estimate ofa iGe Further, as 
A 
the sample “size m is “increased aa iia is asymptotically 
normal. Thus an approximate normal 100(1- )% confidence 


interval for M(t) is given by 


S16 


M(t) aE Za | Varillr)!| (45) 


where z, a is the upper 1-% point of standard normal 
es 


@istbibution and Var[ My(t)] is given in Appendix A. In this 
chapter, we will also study the jackknife and the bootstrap 
procedures for obtaining confidence intervals for My(t) ata 
the case in which i is known. In the second case, we will 
assume that the arrival rate xX is also unknown and must be 
estimated. Then, a nonparametric estimate Ma(t) is the 
product of two estimates, 


A 
A 


M(t) =\[t-2s + 75-t] (4.6) 


acl 


~ A 
where \=n/2y, and K is the number of service times that 


are less than or equal to t. There are no exact functional 
forms for the mean and variance of May (t) Peds sGaicee 
However, the jackknife and the bootstrap methods can be used 
to obtain confidence intervals. This will be described 


below. 


B. JACKKNIFE ESTIMATION METHOD 

In this section, we will study the jackknife procedure 
Bor obtaining a confidence interval for My(t). The 
jackknife was first introduced by Quenouille (1949) for the 
purpose of reducing the estimate bias, and the procedure was 
later utilized by Tukey (1958), to develop a general method 
for obtaining approximate confidence intervals [Refs. 7,8]. 

The basic idea of the jackkKnife estimation method is to 
assess the effect of each of the groups into which the data 
have been divided, not by the results for that group alone, 
but rather through the effect upon the body of data that 
fests from omitting that group. The two bases of the 
jackknife are that we make the desired calculation for all 
the data, and then, after dividing the data into groups, we 


make the calculations for each of the slightly reduced 


oF, 


bodies of data obtained by leaving out just one of the 


groups. A special case of Jackknife estimation is called 
the "complete jackknife estimation", where the number of 
subgroups is n (the size of sample); the i* subgroup is 


obtained by deleting the ith observation; thus the size of 
each subgroup is n-l [Ref. 9]. Attention will be restricted 
to complete jackkKnife estimation in this study. 

Let M A(t) be the estimated mean number of customers 
being served at time t on the portion of the sample that 
omits the th sample. Let Maa (t) be the corresponding 
estimator for the entire sample and define the i 
Pseudo-value by 


A aN one 
M.(t) = nM, (t) - (n-1)M,,(t) (4.7) 


A “a ae 
The Jackknife estimate M.(t) and an estimate Ss of is 


variance are given by 


A TAR 

eee <- & M(t) (4.8) 
ae a ls | x 2 A > 

Si. = pene [M,(t) - M,(t)] (420 


Tukey (1958) proposes that the n estimated pseudo values be 
treated as approximately independent and identica lis, 


distributed random variables [Ref. 9]. Hence, the statistic 


A A 
Jn ( M>cb — May) ) 
ieee A » =%a 
Ca SCH - My > J 
( 4G) 
has an approximate t-distribution with n-l degrees of 
freedom, which leads to the approximate 100(1-% )% 


confidence interval 


A A 
MeCt) +) “cememce ee il) 


= 
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where t \-a is the upper es GhieLTCcal@upoint “of — the 
t-distribution with n-1 degrees of freedom. The confidence 
interval given by equation 4.11 is a function of the 
estimated variance. In the remainder of this section, we 
will describe several methods of implementing the confidence 
tiicerval procedure. We will also obtain an analytic 
expression for the jackknife estimate and its variance 
Betimate for the case in which the arrival rate A is known. 
an Jackknife Estimate with Known Arrival Rate 

In this subsection, the arrival rate is asssumed 
known. In this case the closed form expression for the 
jackknife estimate and its variance estimate can be derived. 

The nonparametric estimate of the mean number of 
customers being served at time t, Mt), can be expressed in 


terms of the service times as follows: 


‘=: 


= 2 a ee . 
aA aay > n t] (4.12) 


where Suy s are the order statistics of independent and 
identically distributed random quantities from the unknown 
probability distribution F, and the variable K is the number 
Pee, = Which are less than t. This equation shows 
immediately that My(t) 1s the linear function of the order 
statistics of service times. 


The jackknife estimate is based on sequentially 


deleting point Sa, and recomputing the estimator. Removing 
Point Suy from data set gives a different empirical 
“w 

eeompability distribution Fy. With mass as at Say + Qay 
pe tant aay? Fay and a corresponding recomputed value of 
the estimate. In the jackknife process, the qth pseudo 
value is 

A A 

M;(t) =( At eee VIG 

A 
A Els  akte aL SOx (4.13) 


oa 


for tne tinea Ciniemse: Accordingly the pseudo values M, (t) 
A 


have just K+l different values. The jackknife estimate is 


A 

A Be \ K 

ae = A sy gears | (4.14) 

This result is exactly the same as the original estimate. 
A 

This is because the estimate M,(t) 1s unbiased. In Appemelmm 


B, the jackknife variance estimate is derived as follows: 
A A A 


Varl M,(t)] = Siegel Se ~ OLE By aS Joc (-3 te 
—~\ M xs “(4D = “Ay Fs m iat) 
A a 
eure Uaae ae) | (4.15) 
a 
where F,(t) is the ‘Sample survivor “Vic eine Comparing 


equation 4.15 with equation 4.3 we see iwek-ne (-25-)[ Var (Mi, 
(t)}] = E[ Var{M.(t)}]. Thus the jackknife variance estimate 
tends to be conservative in the sense that its expectation 
is greater than the true variance of M(t); We will now 
describe two selected procedures to obtain confidenee 
intervals for the jackknife estimate. Tukey suggested that 
the Stavilscic ein equation 4.10 has an approximate 
t-distribution with n-1 degrees of freedom, which leads to 


the approximate two-sided 100(1-&)% confidence interval 


M(t) “r t -a] Vari M,(t)] (4.16) 


fore).  wiere c -% is the upper 1-3 critical polnties 


the t-distribution with n-1l degrees of freedom. However, 
the n estimate pseudo values have just K+ different values. 
Hence, another possible procedure is to adjust the degrees 
of freedom of the t-distribution, that is, subtract one from 
the number of different pseudo values (K+l), and use the 
result as the degrees of freedom. The length of confidence 
interval generated using the adjusted degrees of freedom (K) 


is slightly wider than that generated using the usual 
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degrees of freedom (n-1) and the coverage rate should be 
increased. 
Ze Jackknife Estimate with Unknown Arrival Rate 

In this subsection, it will be assumed that the rate 
of Poisson arrival process is unknown and must also be 
estimated. The maximum likelihood estimate is N=n/e a 
where y; is the interarrival time between i* and Ga). 
customers. A nonparametric estimate of mean number of 


customers being served at time t 1s given by 


A 
» <a Mm — kK 
= ire ees 
M(t) Dee os aa t} (4.17) 
Meee se, s ane the order statistics of independent, 


identically distributed random quantities from the unknown 
probability distribution F. It is assumed that the S.i)s and 
Y,'s are independent. The variable K is the number of S 's 
which are less than t. The data consist of two independent 
random samples, 

eae, ~ 2 and ZY, Yager tn ~ © 
F and Q being two possibly different distribution on the 
Beal line with Q, the Se ere UE Tor with mean see 
From equation 4.17, the estimate My(t) is the product of two 
eistimates. One is the function of y; , KaWlEr, and the 
meget is the function of s. , H(s)=—y = Ss. + = t. there are 
many possible ways’ to perform a two-sample jackknife 
procedure. We will call one method the "paired sample 
jackknife" procedure. Since the size of both samples is the 
same, we make the one set of observations by pairing 
respective observations, that is, CSV, CS Vee CS Ym 
). AS with the one-sample jackknife, we estimate the M,,,(t) 
for all the data, and then, we estimate M(t) based on the 
remaining data obtained by leaving out just the i* pair. 
Thus the ith pseudo value M z(t) is 


M.(t) = nM .(t) - (n-1)M4,(¢) (4.18) 


41 


A 
and the jackknife estimate M,(t) and variance estimates are 
given by 
me) See 
M; ( ) =a Aer) 


; x A A : 
= ------ M.(t) - M 
See age ale) ete) 
Based on these statistics, an approximate two-sided 100 


(1-“)% confidence interval is given by 


A 
Ma@ey “iS (eens (4.19) 
S19 
where t ;.~ is the upper 1-2 point of t-distribution Wises 
pe 
n-l degrees of freedom. A second method is called the 
"Separated sample jackknife" procedure. Since we assumed 


that the X,'s and Y,;'’s are independent, we can perform the 
jackknife procedure separately for each sample, and then, 
estimates which combine jackknife estimates and the 
jackknife variance estimate can be computed. 

Let M5, (t) be the jackknife estimate of A and Vy be 


the jackknite var vance -estimace 9: Ow Ker M(t) bee te 
jackknife estimate of \* F(s)ds and Ve be its jackkKnife 
variance estimate. Then the combined jackknife estimate of 
Mae (t) as 

A A A 

My. (t) = Ms, (t)-M,.(%) (4.20) 


and the combined jackknife variance estimate is 


A 
> A Le 


as Vy Vg + Vy [My (t)1? + Vs (My, (t)1* (4.21) 


The approximate two-sided 100 (1-&) % confidence interval is 


given by 


A 


A 
Mee) mee tg Se A (4.22) 
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where t 4 is the upper 1-2 point of t-distribution with n-1 


degrees of freedom. 


ee SDOOTSLRAP ESTIMATION METHOD 

mepen( 1979 ) introduced the bootstrap method saolig 
estimating the distribution of a statistic computed from 
observations [Ref. 10]. The bootstrap estimate is obtained 
by replacing the unknown distribution by the empirical 
Pest raioution of the data in the definition of the 
seaeistical function. Im practice, the distribution of the 
statistic is approximated by Monte Carlo methods. 

For convenience, the arrival rate is assumed to be Known 
and equal to 1, then the nonparametric estimate Ma( t) is 
just a function of service times. This 1s a one-sample 
problem. The bootstrap procedure is as follows: 

1. suppose that the data points xX .,Xn are 


puma enaent observations from the Geena distribution 
hen the true estimate is 


Mit) = (* (s)ds (Ame3) 


2. We can estimate the distribution F by the empirical 
probability distribution Fn. 
By : fee 5s on each observed data point x; 
Tie a Ae 


3. The bootstrap estimate of My(t) is 


M(t) = i. Fn(s)ds (4.24) 


A 
tieeobtains am estimate of variability »for M,(t), we procede 


as follows 


mye Construct F, Ene emplerealVomseributcion function, 
as just described. 
(2) Draw a_ bootstrap sample  x‘* a niet ae by 


independent random sampling from Ey 
Notice that we are not getting a permutation distribution 


Since the values of x are selected with replacement from 
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the SetyCx, 7 7) 7) As a point of comparison, the 
ordinary jackknife can be thought of as drawing samples of 
size n-l without replacement. 


(3) Compute an Crees Of Viixyecatey for each. bootstrap 
replication, Bee) that is, the value Of statisere 
evaluated for t bootstrap sample. 


Ms(t) = - =x: 


- # 
Kec eS 


+ sel n= Z1( x¥Sty]t (4.25) 


where I(x ao Ties 


O otherwise 
(4) Do step (2) some large number "B" times wes rae 
(Ey pen setike peewee ray replications Mi = 
Based on the bootstrap replications, the approximate 


estimate of M,(t) and its variance are obtained by 


a oe. 
Ma(t)= 4- 2My (t) (4. 26) 


— 


A pee A 2 
Var[Ma(t)] = ---=[MQ(t) - Mat)! (4.998 


A 
A formula for the conditional variance of M,(t) given the 


Original sample data is derived in Appendix C. Ths 
expression is given by s P A 
=e Ba | m-K , K 
Var MCE) = Sra 2 x2 ia [4s Sx] 2 + eel rs elisa 
: K 
Ha \ 
BA ya } (4. 28) 


Notice that the expected value of the conditional variance 
of the bootstrap estimate is approximately equal to the 
Variance of the nonparametric estimate Sect iar which is 
derived in Appendix A. 

So far we have considered the problem, where the arrival 
rate is Known. The bootstarp methodology also applies if 


the arrival rate is unknown and is estimated from 
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a ‘ 


interarrival data. Suppose the data consist of a random 


sample X=(X, ,X, ,...,X from unknown service time 


) 
distribution F and an dent sample Y=(Y,,Ya ,..-7,Ym) 
from the exponential interarrival time distrbution G with 
unknown parameter ). One bootstrap procedure to estimate 
the expected number of customers being served at time t is 
to construct Fy, and Gn , the empirical probability 
@estribution corresponding to F and G. Bootstrap samples 
Ky~F,, i=1,2,...n, Yj~G,, j=1,2,...n, are independently 


drawn, an estimate of M,,(t) 


k m ra i. x 
M(t) = -e2-- -- = x* + --[n - ZBI x*St)]t 4.29 
u(t) sel GZ, f+ -s[n - 2 (xis t)] t] ) 
vA 
is calculated. As before there are a large number B of 
bootstrap replications. For this case, the bootstrap 


Poemace OL M(t) and its variance are still given by 
equations 4.26 and 4.27. There appears to be no closed form 
of the analytical variance of Mp (t) ain this case. Now we 
will describe methods to obtain approximate confidence 
intervals for the bootstrap estimate M(t). 
l. The Percentile Method 

A simple method for assigning approximate confidence 
micervals to the nonparametric estimate M,j(t) is as follows: 
Let 


oe) aa - (4.30) 


Bemeene cumulative distribution function of the bootstrap 


Seectribwtion of M,y (t); Bis the number of bootstrap 


replications. For a given 0<%<0.5, define 
A A o{ “aw A 
L(x) = C () / ied = © (1=) 
Usually denoted simply by L_ and U. This definition runs 


into complications when we actually try to compute quantiles 


L and U from a set of bootstrap replications. To overcome 
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these difficulties, we order the bootstrap replications ime. 


smallest to largest, obtaining the sorted data M(t), fom 


b= ilicie, Vex Letting represent any fraction between O and 
1; take O(%) to be mot) whenever Q is one of the 
functions ASE ae tor i= to ones L(x) turns out wee 


be the (B *X+ 0.5)" M(t) and U(a) to be the (B * (1-%) 
+0.5)" M™°(t). The percentile method consists of taking 


O reune Wes) (4. 315 


as an IEICE Sm eNec l=2c confidence “interval tor Ma(t) since 
% =C(L), 1-K=C(U), the percentile method interval consists 
Ot the central 1-2 & DEOpCr cor of the bootstrap 
distribucion. 
Zi The Bias-corrected Percentile Method 

Efron( 1980) suggests the following bias correction 
for the percentile confidence interval procedure [Ref. 11]. 
He argues that if M a(t) is not the median of the bootstrap 
replication distribution, then a bias correction to the 


percentile method is called for. To be specific, define 
-| “A A 
Zo = $ [C(M,(t))] ( 4. 325 
riMogdyct } 
8 


A 
where C(t)= 


cumulative distribution function for a Standard )neuwa 


as in equation 4.30, and is the 


variate. The bias corrected percentile method consists of 
taking 
A A ~\ 

[fC 19(2z.- z,)} , © 1a (2zeeee ee (452c5 
aS an approximate 1-20 central confidence interval Low 
A 
ata) Here Zy,is the upper point for a standard normal 
Ont2 5) — tare 
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Netice that if Ma(t) is the median of the bootstrap 
distribution then 2,=0O and equation 4.33 reduces to equation 
4.31, the uncorrected percentile interval. However, even 
small differences of Pr{M (t) iS Me(t)} from 0.5 can make 


equation 4.33 much different from equation 4.31. 
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V. SIMULATION RESULTS 


The purpose of the simulation in this Chapter i150 
assess the performance of the nonparametic estimation 
methods, the jackknife andthe bootstrap. Since the 
estimate of M(t), the mean number of customers being served 
aG ime (tc; is a function of the customer arrival rate and 
the integral of the survivor function of the service time 
distribution, two simulations cases are done. The fier 
Simulation case was performed to estimate M,(t), the 
nonparametric estimate of M(t), as a function of the service 
times with the arrival rate assumed to be known and set 
equal to l. For this case, the jackknife and bootstrap 
estimate of the variance were derived in the chapter IV, and 
compared with the numerical estimate obtained by the 
simulation. The second case assumed that the customer 
arrival rate is also unknown and must be estimated using 
interarrival times. 

In each replication of the simulation for case ll, 50 


independent service times from a specified service time 


distribution were generated. For the bootstrap procedure, 
500 bootstrap replications were performed. The simulation 
was replicated 300 times. For the purposes of comparison, 


we considered four types of service time distributions, 


which were the exponential, the mixed exponential, the 
gamma, and the lognormal distribution. The arrival process 
is known to be Poisson process with Known rate j=l. The 


Same generated service times were used for each estimation 
procedure ina replication. This reduces the variability of 
the differences in performance between the procedures. All 
programming was done on IBM 3033 computer at the Naval 
Postgraduate school using the LLRANDOMII, random number 
generating package [Ref. 6]. 
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TABLE III 


STATISTICAL DATA OF ESTIMATE OF M(T) 
N=50 R=300 (B=500) 


True Parametric Nonparametric 

value |Correct \gatey are Jack Boot 

Exponential Cnr Seog 7830 - {Es VAe 0, 7619 
A = 2) GOnOZGS:) (0.0451)1( 0. 0452) 
Mixed expon. OMS 722 Ons 3) 7 dl 0. 6246 Oro ol Of a989 
M=2,U=. 75,0,=. 2) COmoo7e ne CeO026) (0.0512) |( 0.0512) 
Gamma O. 8963 O 2 On sel O. 8965 O. 8967 
( A=1, K =2) (Ome ees | ( O- ©0110) |( 0.20325 yie0. 0325) 
Lognorma O. 8094 O. 8140 O. 7844 OFelsg 0.8138 
(e=. 193, f-=1) (Ome ez). ( 02070) | (0, 0383,)1|@O. OS 83) 


Table III presents the results of several estimation 
methods when the arrival rate is given and equal to l. The 
top of each cell gives the mean estimate of M(t) at time 
t=1, where M( t)={ F(s)ds. The bottom part of each cell 
gives the standard deviation of the estimate. For service 
time distributions other than exponential, a parametric 
estimate based onan erroneous exponential model is also 
given. The estimate in the case of an exponential model is 
[1-EXP{-t/u}] where mM is the mean service time. For each 
service time distribution, the standard deviation of the 
parametric estimate of M(t) is smaller than that of the 
nonparametric estimate of M(t). That is, the efficiency of 
the parametric estimation method is better than the 
efficiency of the nonparametric estimation method. However, 
the results of a parametric fit assuming an erroneous 
exponential model show the worst performance. The true 
Value of M(t) is not included within plus or minus three 
standard deviations of the erroneous estimate M(t). In the 
table, the nonparametric estimation methods seem to perform 
well in all cases with the cost of an inflation of variance. 
Hence the nonparametric estimation method is to be preferred 


when the service distribution is unknown. 
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To illustrate the efficrene, of the nonparametric 
estimation methods, we simulated two possible ways to 
construct the approximate confidence interval for My(t) for 
the bootstrap and the jackknife methods. Those ways are 
presented in chapter IV. For the jackknife estimatiem 
method, one procedure was to construct the confidence 
interval with the regular degrees of freedom, n-1l, and the 
other used the reduced degrees of freedom, which is the 
number of different pseudo values. For the bootstrag 


estimation method, one way used the percentile method by the 


Monte Carlo process, andthe other used the bias-corrected 
percentile method; there were 500 bootstrap replications. 
Nominal “Ge 77> 2) O20 and 90% confidence intervals were 
constructed for each replication using each method. It was 


noted whether the confidence interval formed by a given 
method covered the true value M(t). The entire process was 
independently replicated with R=300 times. From these R 
replications we computed, for each method, the proportion D 
of the R confidence intervals which contained M(t), as well 
as the average length of the confidence intervals. If a 
method was performing adequately, 5 Should be near l-X , and 
a small mean length is desirable. 

Tables IV to VII show the simulation results of several 
confidence interval procedures for four types of service 
time distribution; the exponential, the mixed exponential, 
the gamma, and the lognormal. The arrival process is 
Poisson with Known arrival wate past 


In order to compare the performance of these procedures 


to the normal confidence interval procedure, Simulations 
were conducted, and nominal 68%, SOWA | and 90% confidence 


limits were constructed for time t=l1 for each replication. 
The normal confidence interval procedure is based on the 
order statistics of the service times. By the central limit 


theorem, the distribution of My(t) is asymptotically normal 
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TABLE IV 


COVERAGE AND LENGTH OF OES) ec 
FOR EXPONENTIAL WITH u=2, \ =1 AT T=1 
— = |; | 68% | 80% | 90% | 
Length Cre R maceias oe vena CeR 

(3.4) (30) [| G8) | 

Normal C.I. OmeSsss tee ss, Of 1147 1|12.00| 0.1477 4.67 
69.33 eS OF a2. OO 

Procedure GOmOOse ait cs (©. 0101) |) 4.33 )(0. 01415) 3. 30 
Reduce woroogtle iyawGe 71) O. L187 111.00) 0. 1548 Seas 

TiS 3 S5r53 94. 00 

Jack- Clits (OmOOso) ei OO1( O,0102)) 3.67)(0.0142)| 2.67 
Pee /Regqular| O. 0897 |18.38] 0.1163 |11.33] 0.1505 4.00 
Owes 84. 33 Jee 00 

dat COB OOIO)s TiS OOnG@e. ©1903); 4.33)/(0. 0144))| 3.00 

Percen-| 0.0884 {18.67}; 0.1132 [12.00] 0.1462 4.33 

ile 69.00 SZ. 00 927-00 

Boot=-} method](0. 0098)/12.33/)(0.0112)} 6.00/(0.0154)} 4.67 
strap} Bias- OmescvmlG6. 67} Onlls7 |10.67| 0.1467 7 (9h 
EOrrect a Ne (O18. 63. 00 O26 7 
mecmocne0.0098) 12. 32400, 0112)} 6.38400. 0154)| 4. 67 
distributed as the number of data points n 97@. Thus, the 


100( 1-%)% normal confidence interval is given by 


Diet) ot Z\4 /Var[ M,(t)] aloes 


where Z \-4 is the upper 1-3 point of the standard normal 
distribution and Var({M,y(t)] is given by equation A.14 in 
appendix A. 

Fach cell in the tables contain the average and standard 
deviation of confidence interval length; and the proportion 
of intervals that are too high, (e.g. M (t)<L), where L is 
the lower bound of interval; the proportion of intervals 
covering the true value M(t), p; the proportion of interval 
that are too low, (e.g. M(t)>U), where U is the upper bound 
Guanterval. Table IV is for the exponential service time 
case with with M=2; Table V is for the mixed exponential 
service time case with M,=2, 4:=0.75, and P =0.2; Table VI 


a4 


TABLES 


COVERAGE AND LENGTH OF ae oe “= 3 = 
MIXED EXPONENTIAL WITH wu, =2Z, 


sien 4 SOR 10) GZ 
Length |{C. R ee C.-wikk ee Cum 

mee Soe eS) G22) 
Notma lL’ Cri. 0.0914 |!14.671 0. L169 Mato e0 Soe 5.068 
Ten, 83.00 90.679 
Procedure (0O.0084)/13.67/(0.0102)) 7.00/(0. 0143) = 
Reduced] 0.0936 ;14.00/] 0.1208 9.6/1! 0. cl 4.33 
Fs) 83.3 9167 
Jack- alt (0. 0085)/13.001(0.0103){| 7.001(0. 0143) 4a 
knife|Regular| 0.0923 {|14.67)| O. 1185510700) *@3tse2 4.33 
i200 83.00 91. 36 
Gee (0.0085) /13.33)(0.0103)] 7. 00!( 0. O145))|\) a 
Percen-{| 0.0911 |14.33/] 0.1156 {11.33} 0.1490 43 
tile HL AUIS) Slee SS 89.67 
Boot=-| method!(0. 0094)/;14.00/|(0.0112); 7.33)(0.0153)| Sage 
strap| Bias- OO; Ooms )4..00) -Osiiom 9.00/ 0.1495 4.00 
correct 71-00 83.00 89.67 
method) (0. 0094)1/15.004(0. 0112)| &. 0OO+(020153)))) Ga 


1s for the gamma service time case with @=l and K=Z7 game 
Table VII is for the lognormal service time case with 
S-0) oer ands ie 

The overall examination of the tabulations of confidence 
limit coverage and also the average and standard deviation 
of confidence interval length suggest that the bootstrap 
procedure is slightly better than the jackknife procedure; 
however, the difference is negligible. The normal 
confidence interval is also about the same as the jackknife 
and bootstrap procedures indicating that a sample size of 50 
is large enough for the central limit theorem approximation 
to be adequate. All procedures produce almost the same 
average length of confidence interval with a good coverage 
rate, which falls within + 2/40 of 1- . Although the 
method of the reduced degrees of freedom used in the 
jackknife and the bias-correct percentile method applied in 


the bootstrap improved the coverage rate, the variance was 


a2 


TABLE VI 
COVERAGE AND LENGTH OF 190( 1-% )% on 


FOR GAMMA WITH @=1 K=2, \=1 AT 21 
- ii 68% | 80% 90% ~—s+«YI 
meng th ||\C.. -k ponies Caen Beng thai C. R. 

ee) VS Gene Coe | = 

ileormal cC. 1. Ono Jomi77, ©©O |} ©. 0774 112.33) 0.0989 |10.67 
S545. ja), SS S6u50 

Procedure er OOO two 71( 0.0120); 7.33)(0.0174)| 3.00 
RecuicedNNOm@Gl? 241.00) . 0813 |11.33)| 0.1062 SB SS! 

750k; (816. SVs eT 89.00 

Jack- lee GOmOnOZ ime OOO, ©1277) G.00)(0.0178)| 2.67 
Moeee | Regqilar| O.O0GOl |Z21.67) 0.0785 |}12.33| 0.1008 | 10. 33 
eis), Sis) SOn67 Sie om, 

af (Ome OZ) mee OO1O,O0171)) 7.00)/(0.0177)| 4.00 

Peneen=| OnO509 9) Zes3 | 0.0766 112.00) 0. 0981 O2.33 

tile 69.97 S0)..35 86,67 

Eeoe=| methnod|(0.0091)} 9.00)(0.0122)| 7.67)(0.0179)| 3.00 
strap| Bias- Ome oa, SoieOro77/ | l0.6/) 0.0990 sinas. 
Gorrect Srey, rs en. O© e7. OO 
Mewsogwiorwo lOO) | 11. 334(0., 0121); 8.33400. 0180); 4.33, 
inflated. Furthermore, the amount of improvement was small 
amar not significant. Hence, the original procedures for 


constructing the confidence interval for the jackknife and 
bootstrap are preferred in this case. Note that the 
coverage rates are skewed left slightly but almost balanced. 
It is a reason that the normal confidence interval procedure 
performs well. 

Results will now be reported for the simulation of the 
case in which the arrival rate of the Poisson process is 
also unknown and must be estimated from interarrival time 
data. More computations are required for this case; 
however, the procedure is Same. Each replication of the 
Simulation generated 50 independent service times and 50 
independent exponential interarrival times having mean 1. 
Confidence intervals were computed using both separated and 
paired jackknife procedures and the percentile method for 


the bootstrap. The number of bootstrap replications was 


ope. 


TABLE VII 


COVERAGE AND LENGTH OF 100(1-“)% Ts. 
FOR LOGNORMAL WITH $§=.193, ¢*=1, \=1 AT T=1 
a Lari. 68% | 80% | 90% 4 
pene C ae pene Sess Length |C.R 
2G255--.20c. |e (8.0) __ | ees eee 
Normal “62 FE. QO.0769 {16.67} O. LOO7 | FOsC0 Grrr 6. 6m 
71200 73.00 89.3 
Procedure (0.0075) ])12. 33|(0. 0096) ;12. 00|( 0. 0126 aa 
Reduced; 0.0787 |16.33] 0.1208 9.6/| O.a38¢ 6:38 
Pees US Sy 89567 
Jack- 2 aed (0.0076) |12.33(/(0. 0097) {|10. 67)|(0. 0127) |) 45s 
knife|Regular| 0.0763 |16.67;| O. 1021 | 20800) "Oye, 6.567 
TOSS 73300 shel SiS 
aie (0.0076) [13.00;(0. 0097) | 11.00) (0. 0128) 4a 
Percen-| 0.0763 |16.67]|] 0.0998 | WO Nes) O72 7.38 
ile 69453 Ti nOO oS. GF 
Boot=-| method) (0. 0084) /14.00|( 0. O105) | 12. 6&7) ( 0. 0133) esas 
strap| Bias- O-@76Ss | 16700) -OaloGg2 S. 67 Oizo 6). ore 
COrrecet Gero 3 fgore sc ' 88. 67 
method/( 0.0084) 115. 67]( 0.0106) |13. OG) 0) 0ls2 Sa 
LOOG: Nominal 68%, 80%, and 90% confidence limits were 
computed for each replication. The simulation was 


replicated 300 times. 

Tables VIII to xX report the results of the simulation. 
The quantities in the left part of each cell are the average 
and standard deviation (within parenthesis) of coverage 
interval length. The right part of each cell contains three 
quantities; the top value is the proportion of intervals 
that are too high; the center value is the proportion of 
intervals that cover the true value, D; and the bottom part 
is the proportion of intervals that are too low. 

In Table VIII (the case of 68% C.I1.), the average length 
from the bootstrap shows outstanding performance with a 
small value of standard deviation. The paired jackknife 
procedure performs as well as the bootstrap procedure. This 
procedure reduced the standard deviation by more than half 


of that in the separated jackknife procedure, and also 
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MigmovedsenNe COVEeLAage Kate. From the results of coverage 
rate in the table, it can be recognized immediately that the 
jackknife estimate, regardless of the application method, is 
often too low, while the bootstrap estimate tends may a 
Meetle too high but is almost balanced in the number of 
confidence intervals that are too high or too low. TE. is 
the reason that the bias-corrected percentile method was not 
required in this case. 

In this simulation, all the coverage rates fall within 
+ 2|X2D of 1-4, (62.61, 73.38). Note that the average 
length of the confidence interval in the gamma service time 
case is the highest. When the arrival rate was known, the 
gamma service time case had the smallest average length. 
This indicates that the variability of the estimated arrival 
rate may be the dominate effect in the width of the 
confidence interval. 

The results of Tables IX and xX support the facts of 
discussion about Table VIII, This is the reasonable since 
the same random numbers were used _ to compute these 
confidence interval. The presentation of Table IX is 
exactly the same as the case of table VIII. All the 
coverage rate are fall within (75.38, 84.61), though the 
value of coverage rates fluctuate over the service time 
foes tribution cases. Obviously the paired jackknife 
procedure performs very well. The bootstrap procedure still 
has the best performance; however the value of coverage rate 
fluctuates greatly for the different service time 
distributions. For the 90% confidence interval case 


reported in Table X, some coverage rate fall outside the 


range (86.53, 93.46). The separated jackknife procedure 
produces low coverage rates outside of (86.53, 93.46), 
except for the exponential service time distribution. The 


paired jackknife procedure improved the coverage rate 


tremendously. Although the average lengths in the paired 


ON) 


jackknife procedure are slightly bigger than those in the 
bootstrap procedure, the overall performance is better than 
the bootstrap. Furthermore, the procedure in the case of 
gamma service time distribution, the bootstrap produced one 
coverage rate outside of (86.53, 93.46). However, this 
could be due to sampling fluctuation. 

In general, all the confidence interval procedures 
performed very well for the exponential service time case, 
regardless of the level of the confidence interval. The 


procedures also worked well in general to produce 68% and 


80% confidence interval. However, performance was more 
variable in the 90% confidence interval case. In moore 


cases, the average length produced by the bootstrap 
procedure is the smallest, but the value of the coverage 
rate fluctuates for different service time distribution. 
The overall examination of the tabulations suggests that the 
paired jackknife procedure performs very well compared to 
the separated jackknife procedure and in some cases shows 


better performance than the bootstrap procedure. 
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TABLE VITTI 


COVERAGE AND LENGTH OF o8% Cue 
WITH UNKNOWN ARRIVAL RATE (N=50, =300, B=1000) 


Jackknife BOOeCS Trap 

Separated Paired Percentile 

sae ae Gauneimebengua |C. R poe Cron 

Gene) Seca |e Ss. o)ae 
Exponential 0.2674 Vi 010) (MN OR ae vas. S267 lO. Zo 5G eas s 
7.0: Owe S G6. 67 
( A= 2) (Ope Se? 73 1) O205/0) | 20. 00|(0.0510)715. 00 
Peeeed exponen| 0.1999 |12.00|] 0.2077 |15.00] 0.2009 |21. 67 
(a =2, 4. =. 75 G5a6/ 68. 64.00 
y= 2) (0. 0832)/22.331(0.0429)117. 00/(0. 0398) 14. 33 
Gamma Cm oly Omee i) O.2/46 | 13733) 0.2632 |21.67 
68. 00 Gye 66. O00 
eerie | On 76) 122. OO ©. 0599) | 19. 00)(0.0557))112. 33 
Lognormal On 2533 peor Cez513 710.00) 0. 24158 flo. 33 
. | 2. OO 72433 69. 67 
(§=.193, fd =1)}](0.0998)|20. 33](0. 0486) |17. 67|(0. 0461)]11. 00 
TABLE IX 


COVERAGE AND LENGTH OF oO Cyt: 
WITH UNKNOWN ARRIVAL RATE (N=50, =300, B=1000) 


Jackknife Bootstrap 

Separated Paired Percentile 

pengen Caw. Tee Ce HK. Length |C.R 

ee (Ss. d) (S.a) S. 

Exponential 0.3447 Seon) O24 5202 Se OOM MOr Ss NOZs Zee 7 
79.67 ez. 00 S23. 00 
( M= 2) (Ome Zo nis. OOO, O64) ))10.001(0.0585)) 4.33 
Mixed exponen] 0.2558 O. 2688 Sow |LsOazooo ple. 67 
(& =2, w=. 75 75.67 SZ os O08, 
Ey. 2) CO71088 ) (0O.0569)]12.00;(0.0490)| 4.33 
Gamma O. 3646 5. OO}, 0. 3504 S267se-O. S575. | to.33 
7 S6g, Sn oy Vow ew! 
ene es —2 | ( O. 1509) 119. 3aaK 0. 0703)}11.67)(0.0657)]| 8.00 
Lognormal OrsZ 9 Ae Gratin 323 1. 6233) 0.311594: 67 
a Paes Poms 3 i>. 67 
(§=.193, f =1)}(0.1279)|16. 67] (0. 0624)|14. 33|(0.0585)/10. 00 


a 


TABER Ss 


COVERAGE AND LENGTH OF 20 Ga Cale 
WITH UNKNOWN ARRIVAL RATE (N=50, =300, B=1000) 


Jackknife - Bootstrap 

Separated Paired Percentile 

pene Ce eR eee Ca peng Cae: 

: +3 | See _.|.08:°)__ | ee 

Exponential 0. 4464 L333) (Ona 2 oS! 2. 67| 0.40om 8. Ga, 
90. 00 922353 89.00 
(M= 2) (O.1680)| 8.67/(0.0844){] 5.00|(0.0/61) | Zea 
ee Ae ee Oe SIS O67) Onssse 2:33) O-ssion 8. Oe 
bo=, 75 Save 35 Soe OO 86. 6a 
Sa (0. 1213) ;16. 00! (0.0679) ) 9.67 | GO} OCGO2t yy aan 
Gamma 0. 4891 2.00; O. 4586 3.33) 0: 4416 9.00 
Sia), oe Sicha (59 85. 6@ 
( ®=1, k=2) |(0.2321))12.67)(0. 1053)) 8. 00)/{ 0. 0959) | sae 
Lognormal 0.4371 1.678 O. 4228) WS O00) BO. er 4 7 SO@ 
" }S3- 33 | 29: 88 \Paaee 
(§=.193, fd =1)|(0. 1974)713. 3311( 0. 0920) | 7. OO}{ 0. 0847) i) 4am 


58 


VI. SUMMARY AND CONCLUSIONS 


This thesis consider the problem of estimating M(t), the 
mean number of customers being served at time t for an 
M/G/foo queue, using service time and interarrival time data. 


It is assumed that there are no customers being served at 


time O. Two cases are considered. In one the parametric 
form of the service time distribution is assumed Known. in 
this case M(t) is a function of the estimated parameters. 


In the situation in which the arrival rate of the Poisson 
process is also assumed known and the parametric form of the 
service time distribution is exponential, approximation to 
the bias and variance of the estimate are derived. Further, 
Simulation is used to study a normal confidence interval 
procedure. 

For the other case the parametric form of the service 
time distribution is unknown. The empirical distribution of 
the service time distribution is used in the estimate of 
M(t). Mimeaimes herdietonle i Me widtecm che arrival mate A» is 
assumed Known, the distribution of the estimate is derived 
in Appendix A. The bootstrap and jackknife estimates with 
known are studied in Appendix B and C. Simulation was used 
to assess the performance of confidence interval procedures 
using a normal approximation the jackknife and the 
bootstrap. The simulation results for the case in which the 
arrival rate ) is known indicate that: 

(1) The arametric estimation method appears the most 
powfu method when. the parametric assumption is 
correct, but the performance is seriously degraded if 
the assumption is not appropriate. 

(2) When an erroneous parametric Penponemete>) model is 
assumed, the initial estimates of mean number in 
SOR vinG Cmake VOOOL . Hewe Veta = acme 2.7) Ene erroneous 
parametric estimate approaches the same value as the 
other estimates. This is because as t 70 all the 


estimates approach the sample mean of the service 
time data. 


ofS) 


(3) The estimate obtained a using the empirical 
distribution 1s unbiased with a larger Varvanceuraom 
a parametric estimate based on a correct model. 

(4) Simulation results indicate there is not. mullem 
difference between jJackknife and booustrap conta aa. 
interval procedures. 

Go) “ive nonparametric normal confidence interval 
procedure performs as well as. the procedure in (4 
Since the distribution of the estimate is almos 
Symme t rier The improvement by the use of adjusted 
degrees of freedom in | the ackknife an the 
bias~-corrected percentile in the bootstrap is small. 

We now discuss the simulation results for the case in 
which the arrival rate for the Poisson process is also 
unknown and is estimated using interarrival time data. The 
service times are generated from four types of distribution. 
The percentile method for the bootstrap and paired and 
separated techniques for the jackknife were used to 
construct the confidence intervals. Tables I and II, which 
are the results of a parametric confidence interval 
procedure in chapter III, are compared with the results of 
the nonparametric confidence interval procedures. The 
Simulation results indicate that: 

(1) The nonparametric confidence interval procedure worke 
as well as the parametric case, even though irae 
length of the confidence interval is wider thangmme 
pDalameerac .Onme- 

(2) In the overall examination, the percentile method of 


the bootstrap shows the best pertormance. These pam 
Jee ee procedure also has similiar results to the 


©OUStrEap. apPproacn. The results of these two 
nonparametric procedures show the almost. same level 
of performance with the parametric one. However, the 


separated jackknife procedure produces poor results. 

(3) The results of the jackknife procedures produce 
intervals that are always biaseaqmemr anc: Efron 
(1981) reported similiar results. [Ref. 13] 

(4) Since the bootstrap procedures require a large ameume 
of computation, the jackknife is the method of choice 
lad one does not want Ze) do the bootstrap 
GOMputaArL Ons. 

In general, the nonparametric methods of the bootstrap 
and the jackkKnife performs very well, regardless the 
complexing of the estimation problem. Of -eGurce, if the 
parametric estimation method can be applied, the results are 


clearly superior. However, the application of the 
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parametric estimation is a highly limited because the 
wanametric assumption is often difficult to verify. When 
the estimate is simple enough, which is the nonparametric 
estimate when the arrival rate is known, and the asymptotic 
distribution of estimate can be obtained, the nonparametric 
normal confidence interval procedure performs well, and more 
complicate computations such as the jackknife and _ the 
bootstrap method are not required. However, the jackkKnife 
and the bootstrap method have a good performance for the 
more complicated problem in which the arrival rate 1s 
unknown. The bootstrap confidence intervals show the best 
performance but the paired jackknife procedure achieve the 
same level of performance with less computation than the 


bootstrap in this problem. 


eit 


APPENDIX A 


CALCULATING THE BIAS AND THE VARIANCE OF M,y(T) WITH KNOWN 
ARRIVAL RATE 


It will be assume that the rate of Poisson process aaa. 
Known and is equal to l. The nonparametric estimate My(t) 


Ls) Given py 
ae 
My(t) = \°[1-F(s)]ds (Ale 
Using the empirical cumulative density function ~ Fag the 


nonparametric estimate is 


=~ 


1X mK. 
My(t) = * 2545+ a c (A. 2) 


where the observation Su § are the order statistics of the 
indeprndent and identically distributed service times with 
unknown distribution F. lo find the distribution of My@ae 
we will study the distribution of { Quy} - 

Let Des cat ten be independent, identically distributed 
random variable with distribution function ©. Let N-' be the 
number of X;°s which are less than t. Let i idenc-cs i* 
Smallest X;. By the definition of conditional probabilig 


P{X, < X|Ne=1} S Tae cates le (A. 3) 
Py Na 217 
LOm —Xst. Since the random variable Ny has a binemuam 
distribution with a parameter F(t), we can rewrite the 


equation A.3 to obtain 


— n-| 
A C 
es X|N,=1} = OO EM CC Fee A 


(*) Fu) C Fd 3") 


ey (A. 4) 
Fe) 
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Let f(x) be the density of the distribution function F. The 
Senemtcional probability given N,=z, 
ese ek, * OX), X,<% exe dx | N, =Z] 


eg S8408% Fonda (A.5) 


mot x ,<X,. 
A 
Senet eecme COncretonal diStribution=e6f the values 
of the unordered X,; that lie in (0,t] is that of independent 


random variables with distribution function ~2 hem 2O< <tc. 


A Fae” 
mimes, given N,=K, M(t) has the same distribution asa 
“a 
constant plus the sum ue 1K independent identically 
Spoeri buted random variables. Thus the expectation of M,(T) 


can computed by the property of conditional expectation, 


EIMy(t)] = ELE[My(t) [Nel] (A.6) 


A 
Given N,=K, 


A : ( = 
E[My(t) |N,=K] = [*, Ee ai a= +t (A. 7) 


Since the random variable Ny Mas, ‘a binomial ~ dGistribpution 


with the parameter F(t), 


- FH _ (Edo , m-nFé) 
EIMy(t)] = A e Fra * on Ls 
- j* 
cam) ero (cl) eeamleninere (at ))| C (A. 8) 
To check the bias of the estimate My(t), using the 
integration by part of equation A.1, the true estimate is 
given by 
t 
My(t) = t{1-F(t)] + {“tE(dx) (A.9) 


Gs 


Thus the estimate M,y(t) 1s unbiased. Using conditional 
expectations, the variance is computed by 
Var[My(t)] = EL(My(t) - ELM, t)] )?] 


= E[(My(t) - ELMy(t) {Nel )2] 


+ 2E[(My(t) - ELM t) ING] )(ELMy(t) [Ng] - ElMy(t)] )1 
+ EL(ELM,(t) Ng] - ElMy(t)] )2] (A. 10) 


Computing each of the individual terms, we obtain 
EL(M,(t) - ELMy(t) {Ni] )?) 


c dx 
ay ee ele Gi) - [x ee aren (Ae aaa) 
N Fut) ° Fur) 
and 
BCE Matt). Nel ecew me Met et 
| = t Feo 
=e (00) 15 ine ---- - t]?¢ And 
~ FCt)E(t)( Joe Te - (A. 12) 
where F is the survival distribution fEunctioneo: =. The 


second term of right term of equation A.10 turn out to be 


Zero. Thus the variance of My(t) 1s obtained by sum of two 
equations. The resulting variance is 
Var[M,y(t)] = = { {*x2R(dx) - [ [(*xE(dx)]2+t2F(t)F(t) 
= 4 
- 2tE(t)[ J xF(dx)] } (A.13) 


Notice that the variance estimace of (lee) is equal to 


57 Var( X) as t»x»m. A nonparametric estimate of Var[My(t)] is 
A x A A 
os eS 1 « K K 
Var[My(t)] a SE ee +t Saree Pe! 
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A\ 


” 
kX es 

egte=— | oa Ss. Ae 
5 ee ( ) 


A 
where K is the number of service times that are less than t. 


ens, 


APPENDIX B 
JACKKNIFE ESTIMATE OF NONPARAMETRIC ESTIMATE 


It will be assumed that the arrival rate A is known and 


equal to l. The nonparametric estimate is given by 
A 
A 
M(t) = ye es ae (30 
N NW & rn 
where Say $ are the order statistics of the service times and 
assumed that the random variable K exists such that S45 Gay 
As 
. £85,/St¢ S 4..29 ; The estimate M*(t) for the data set 
(K) (iH) nN) . n-t 
obtained by deleting tne i point from the sample is 
oak = \ Wis = K . 
ee) m-\ = ori. n-\ c Pe 1 > K 
S A 
\ - ; : 
es Se es if if‘kK (B.2) 
N-\ tA 2 n-| 


a 
The pseudo-value M,(t) is computed by 
A 


A aur 
M,(t) = nM, (t) - (n-1)M,4(€) (B. 3) 


A 
where Le oa 1s the estimate of Myj(t) based on all the adaua 
As 
substituting the estimate OI in equation B.3, we get 


“A 
M.(t) =( Sy, nla cea aed 56 (B. 4) 
12 ae eee a 
Since the jackknife estimate is the average of the 


pseudo-values, the estimate is given by 


A 1 K 
MEG) sae 2 


= = eS: 
Naa &) rn (Ba 
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Thus, the jackknife estimate 1s the same as’ the original 
nonparametric estimate of My(t). The jackknife estimate of 


variance 1s 


A | pe ay, 
Var[M,(t)] = age eats M(t) }? (SS eu SS (B. 6) 
where A 
2 
= (M(t) 3 s8- (n-K)t 
and, A 
Sy ae : r\t]2 
eee) oa [ = Susy? ‘(Csts)) 1s] 
mis) Che yarlance can be ahi ale as Aaa 
- Sg es er eee 2 aos K 
ees Banta es ba nv! 
x \ 
- 2t nk, JS B.7 
na Say | (B. 7) 


Comparing equation B.7 with equation A.14, it is seen that 
the jackknife estimate of variance is greater than the 


estimate of equation A.14 by a multiplicative constant shee 


OH 


APPENDIX © 
CONDITIONAL DISTRIBUTION OF BOOTSTRAP ESTIMATE 


It will be assumed that the arrival rate A is known. 


and is equal to l. Let S,,8,,--,5, denote random servis 

times from a CDF F, and let § *«S ¢..4S8S sts S £€..£5 denote 
Uy ay UK) (Kel) ON, 

the corresponding order statistics. The nonparametric 


estimate of | F(s)ds is 
nN 

A { K n- 

ip E a = We Gna = 


i> 


“- t (Cals) 

Let B;, i=1 to n, be independent random variables having@ege 
Same distribution as draws with replacement from (S;,S,,-+7S, 
Ne and let Bey i=l cometh be the corresponding order 
statistics. A bootstrap realization of the nonparametric 


estimate is 


A ey A 
M,(t) Sas a Day t mince 2 I(k, St) 1t (C25 
ha St 
where 
exe ee) YG shpe od oa 
O otherwise 
To compute the distribution of Mp(t), the Laplace transform 
A 
is used |Ref. 12]. The Laplace Cranstorm orm Gea 
A 
Bl exp(=s Ma(7))) = BlElexp( Ss MaCe) tae (C.3) 


by the property of conditional expectations, where N; 1s the 
number of bootstrap samples which are less than or equal to 
oe We compute the right hand side of equation C.3 
separately. First 

El exp(-§ Mg(t)) INe= 11 
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A 


oe ae 
= exp( -§t)[ exp( -§ 47) ~exp( - 8-82) ] (C. 4) 


is computed, where the random variable Nt has a binomial 
distribution with the parameter FE(t)= K . Thus, EEO 
equation C.3 the Laplace transform of M(t) cs p 
Blexp(- M (t)] = exp( ~§t) Zl exp( -¢+) & ~rexp( -§-£+)| 
(Ne Goi , 
= exp( ~§t)[ “fexp(§ 4) E-bexp( -g-R) + BEI" (5) 


Let us define a random variable Y having the following 


distribution 


Su | eae 
ar ( a NAD lieu 21K (Cc. 6) 
A 
= eo abe SIRES 5 
Then the Laplace transform of Y is 
_ Sy te f 
El exp(-§Y)] = 5~ Zexp(-§-4>) + ~prexp(-8-¥) (C.7) 


A 
Thus, Mp, (t) has the same distribution as the sum of n 


independent random variables having the same distribution as 


gs For the fixed time t, given the order statistics, §&,<S§,, 
= <5 <t< <i S j i i 
a; Ne Sep any! the expectation of Mp ( t) is written by 
E[ Mg (t) |data] = nE[ Y| data] 
A 
es n- 
=--Ss + --- t (C. 8) 


Nn iq ) Vv\ 


Thus, the bootstrap estimate is asymptotically an unbiased 


Pereemace of M(t). ~The variance is 
A 
Var[M,(t) |data] = nVar[ Y] CC. 9) 
The variance of Y can be derived using the equation C.6 


Since 
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-—- ; 
E(v?] = 4-E( S82] 2 + D38( hye (C. 10) 


Hence the asymptotic bootstrap variance estimate of M,(t) is 


given by 


A ‘\ N 4 
o | =, el lees 8 a 5 emt 
Var[M,(t)|[data] = . f . 2. $F [ “s = §;,] +t a 
A [ > 
as i an ae 2 
2h = [ x 2 A! } iC. 1 
That is, the asymptotic bootstrap estimate of variance is 


the same as the nonparametric estimate (equation A.14). 
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